Engineering fungi for the utilisation of L-arabinose
Field of the invention 

The present invention relates to a genetically modified fungus and its use for the
production of useful products such as ethanol, lactic acid, xylitol and the like from
materials containing the pentose sugar L-arabinose.

Background of the invention

L-arabinose is a major constituent of plant material; L-arabinose fermentation is
therefore of potential biotechnological interest.

Fungi that can use L-arabinose and D-xylose are not necessarily good for industrial
use. Many pentose-utilising yeast species, for example, have a low ethanol tolerance,
which makes them unsuitable for ethanol production. One approach would be to
improve the industrial properties of these organisms. Another is to give a suitable
organism the ability to use L-arabinose and D-xylose. There are pathways for
D-xylose and L-arabinose which are known to be active in bacteria. For D-xylose
catabolism the pathway consists of a xylose isomerase, which converts D-xylose to
D-xylulose, and a xylulokinase, which forms D-xylulose 5-phosphate. For L-arabinose
catabolism the pathway consists of an isomerase, a kinase and an epimerase, which
convert L-arabinose to L-ribulose, L-ribulose 5-phosphate and D-xylulose 5-phosphate, with
D-xylulose 5-phosphate being an intermediate of the pentose phosphate pathway
(Stryer, 1988). Attempts have been made to overexpress this bacterial pathway in the
yeast S. cerevisiae, but it was not functional. The three enzymes of the L-arabinose pathway
were expressed and shown to be active; however, no growth on L-arabinose as a sole
carbon source was reported (Sedlak and Ho, 2001). Expression of xylose
isomerase in a fungal host was likewise not successful (Sarthy et al. 1987, Chan et al. 1989,
Kristo et al. 1989, Moes et al. 1996, Schründer et al. 1996). The reason for this is not
clear. There might be a species barrier that prevents these bacterial isomerases from
working in fungi. There may also be metabolic imbalances in the host that are resolved by
an unknown mechanism in the donor.

There is also a hypothetical eukaryotic, i.e. fungal, pathway in which L-arabinose is
also converted to D-xylulose 5-phosphate, but by a different route (see figure 1).
This pathway has been suggested to use 2 reductases, 2 dehydrogenases and a
kinase as shown (Chiang and Knight, 1961, Witteveen et al., 1989). While the genes
of the bacterial pathway have been known for decades, very little is known about
this hypothetical fungal pathway.

A fungal pathway for L-arabinose utilisation was described by Chiang and Knight
(1961) for Penicillium chrysogenum and by Witteveen et al. (1989) for Aspergillus
niger. It consists of an NADPH-linked reductase, which forms L-arabinitol, an
NAD-linked dehydrogenase, which forms L-xylulose, an NADPH-linked reductase,
which forms xylitol, an NAD-linked dehydrogenase, which forms D-xylulose, and a
xylulokinase. The final product is D-xylulose 5-phosphate, as in the bacterial
L-arabinose pathway (see figure 1). This pathway was described only for filamentous
fungi, but there are indications that it may also occur in yeast. Shi et al. (2000)
described a mutant of Pichia stipitis which was unable to grow on L-arabinose.
Over-expression of the NAD-linked xylitol dehydrogenase could restore the growth
on L-arabinitol, indicating that xylitol may be an intermediate in the L-arabinose
pathway. Also, yeast strains which had L-arabinose as a sole carbon source
produced L-arabinitol and small amounts of xylitol (Dien et al., 1996), indicating
that yeast might use this pathway. The capability of L-arabinose fermentation is not
a common feature of yeast. Many yeast species mainly accumulate the L-arabinitol
formed from L-arabinose (McMillan and Boynton 1994). Only recently were yeast
species identified which are capable of L-arabinose fermentation (Dien et al.,
1996).

The hypothetical fungal L-arabinose pathway has similarities to the fungal D-xylose
pathway. In both pathways the pentose sugar goes through reduction and oxidation
reactions where the reductions are NADPH-linked and the oxidations NAD-linked.
D-xylose goes through one pair of reduction and oxidation reactions and L-arabinose
goes through two pairs. The process is redox neutral, but different redox cofactors,
i.e. NADPH and NAD, are used, which have to be separately regenerated in other
metabolic pathways. In the D-xylose pathway an NADPH-linked reductase converts
D-xylose into xylitol, which is then converted to D-xylulose by an NAD-linked
dehydrogenase and to D-xylulose 5-phosphate by xylulokinase. The enzymes of the
D-xylose pathway can all be used in the L-arabinose pathway. The first enzyme in
both pathways is an aldose reductase (EC 1.1.1.21). The corresponding enzymes in
Saccharomyces cerevisiae (Kuhn et al. 1995) and Pichia stipitis (Verduyn, 1985)
have been characterised. They are unspecific and can use either L-arabinose or
D-xylose at approximately the same rate to produce L-arabinitol or xylitol,
respectively. Genes coding for this enzyme are known e.g. for Pichia stipitis
(Amore et al., 1991), Saccharomyces cerevisiae (Kuhn et al., 1995, Richard et al.
1999), Candida tenuis (Hacker et al., 1999), Kluyveromyces lactis (Billard et al.,
1995) and Pachysolen tannophilus (Bolen et al., 1996).
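
The cofactor bookkeeping described above can be checked with a small sketch. The step lists below simply transcribe the two fungal pathways as the text describes them; the ATP/ADP entry for the xylulokinase step is an added assumption (the text names the kinase but not its phosphate donor), and all names in the code are illustrative only.

```python
from collections import Counter

# Each step: (substrate, product, cofactor consumed, cofactor formed).
# The ATP/ADP pair on the kinase step is an assumption added for completeness.
FUNGAL_L_ARABINOSE = [
    ("L-arabinose", "L-arabinitol", "NADPH", "NADP"),
    ("L-arabinitol", "L-xylulose", "NAD", "NADH"),
    ("L-xylulose", "xylitol", "NADPH", "NADP"),
    ("xylitol", "D-xylulose", "NAD", "NADH"),
    ("D-xylulose", "D-xylulose 5-phosphate", "ATP", "ADP"),
]
FUNGAL_D_XYLOSE = [
    ("D-xylose", "xylitol", "NADPH", "NADP"),
    ("xylitol", "D-xylulose", "NAD", "NADH"),
    ("D-xylulose", "D-xylulose 5-phosphate", "ATP", "ADP"),
]


def cofactor_balance(pathway):
    """Net change per cofactor over the pathway (negative means consumed)."""
    net = Counter()
    for _substrate, _product, used, formed in pathway:
        net[used] -= 1
        net[formed] += 1
    return dict(net)


def is_connected(pathway):
    """True if each step's product is the next step's substrate."""
    return all(a[1] == b[0] for a, b in zip(pathway, pathway[1:]))


if __name__ == "__main__":
    for name, pathway in [("L-arabinose", FUNGAL_L_ARABINOSE),
                          ("D-xylose", FUNGAL_D_XYLOSE)]:
        print(name, is_connected(pathway), cofactor_balance(pathway))
```

The tally shows why the process is redox neutral overall yet still demanding for the cell: each reduction/oxidation pair uses one NADPH and forms one NADH, so the L-arabinose pathway needs twice as much NADPH regeneration and NADH reoxidation as the D-xylose pathway.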

The xylitol dehydrogenase (also known as D-xylulose reductase, EC 1.1.1.9) and
xylulokinase (EC 2.7.1.17) are the same in the D-xylose and L-arabinose pathways of
fungi. Genes for the D-xylulose reductase are known from Pichia stipitis (Kotter et
al. 1990), Saccharomyces cerevisiae (Richard et al. 1999) and Trichoderma reesei
(Wang et al. 1998). The gene for a fungal xylulokinase is only known for
Saccharomyces cerevisiae (Ho and Chang, 1998).

Genes coding for L-arabinitol 4-dehydrogenase (EC 1.1.1.12) or L-xylulose
reductase (EC 1.1.1.10) are not known.

The invention aims to make it possible to express the pathway for L-arabinose
utilisation in fungi. The hypothetical fungal pathway expressed in Saccharomyces cerevisiae
would result in a strain which can ferment nearly all sugars from forestry and
agricultural waste to ethanol.

Summary of the invention 

According to the invention, this problem is solved by a genetically modified fungus,
which is characterised in that it has been transformed with a gene for L-arabinitol
4-dehydrogenase or a gene for L-xylulose reductase or both such genes.

According to the present invention, a fungus is transformed with all or some of the
genes coding for the enzymes of the L-arabinose pathway, i.e. aldose reductase,
L-arabinitol 4-dehydrogenase, L-xylulose reductase, D-xylulose reductase and
xylulokinase. The resulting fungus is then able to utilise L-arabinose. We disclose
genes for L-arabinitol dehydrogenase and L-xylulose reductase. We disclose that
when a fungus such as S. cerevisiae that is unable to utilise L-arabinose is transformed
with genes for aldose reductase, L-arabinitol 4-dehydrogenase, L-xylulose
reductase, D-xylulose reductase and xylulokinase, it becomes able to utilise
L-arabinose. We also disclose that when a fungus, such as genetically engineered S.
cerevisiae, that can use D-xylose but not L-arabinose is transformed with genes for
L-arabinitol 4-dehydrogenase and L-xylulose reductase, it can utilise L-arabinose.

By the term utilisation it is meant here, for example, that the organism can use
L-arabinose as a carbon source or as an energy source or that it can convert
L-arabinose into another compound that is a useful substance.

Brief description of the drawings

Figure 1. The hypothetical fungal and the bacterial pathway for L-arabinose 
utilisation. 

Figure 2. L-arabinitol 4-dehydrogenase sequence (SEQ ID NO. 1): The sequence of
the genomic DNA was combined with the cDNA sequences of the N-terminal and
C-terminal region. The amino acid sequences in bold are from the peptide fragments
of the purified protein. The intron sequence is underlined.

Figure 3. Sequence of the cDNA clone and protein sequence for the L-xylulose
reductase (SEQ ID NO. 2).



Detailed description of the invention

The central teaching of this invention is to demonstrate how a fungal
microorganism can be genetically engineered to utilise L-arabinose. By utilisation
we mean that the organism can use L-arabinose as a carbon source or as an energy
source or that it can convert L-arabinose into another compound that is a useful
substance. Some fungi can naturally utilise L-arabinose, others cannot. It can be
desirable to transfer the capacity of utilising L-arabinose to an organism lacking the
capacity of L-arabinose utilisation but with other desired features, such as the ability
to tolerate industrial conditions or to produce particular useful products, such as
ethanol or lactic acid or xylitol. In order to transfer the capacity of L-arabinose
utilisation by means of genetic engineering it is essential to know all the genes of a
set of enzymes that can function together in a host cell to convert L-arabinose into a
derivative, e.g. D-xylulose 5-phosphate, that the host can catabolise and so produce
useful products.

One example is to genetically engineer S. cerevisiae to utilise L-arabinose. S.
cerevisiae is a good ethanol producer but lacks the capacity for L-arabinose
utilisation. Other examples are organisms with a useful feature but lacking at least
part of a functional L-arabinose pathway.

An L-arabinose pathway believed to function in fungi is shown in figure 1.
Genes coding for the aldose reductase (EC 1.1.1.21), the D-xylulose reductase (EC
1.1.1.9) and xylulokinase (EC 2.7.1.17) are known. In order to construct a strain
that can use L-arabinose by this hypothetical pathway, two additional genes would
be required, i.e. genes for L-arabinitol 4-dehydrogenase (EC 1.1.1.12) and for
L-xylulose reductase (EC 1.1.1.10).

L-arabinitol 4-dehydrogenase: An L-arabinitol 4-dehydrogenase was described for
Penicillium chrysogenum and Aspergillus niger by Chiang and Knight (1960) and
Witteveen et al. (1989), respectively. This enzyme converts L-arabinitol and NAD to
L-xylulose and NADH. It was also reported to have activity with NAD and adonitol
(ribitol) and with NAD and xylitol (Chiang and Knight, 1960).

L-xylulose reductase: The L-xylulose reductase (EC 1.1.1.10) converts xylitol and
NADP to L-xylulose and NADPH. Another enzyme, which has been reported to
catalyse the same reaction, is the D-iditol 2-dehydrogenase (EC 1.1.1.15) (Shaw,
1956).

L-xylulose reductase was found in Erwinia uredovora (Doten et al., 1985),
Aspergillus niger (Witteveen et al. 1994) and guinea pig (Hickman and Ashwell,
1959). A preparation from pigeon liver is commercially available (Sigma-Aldrich).
A single subunit of the enzyme from Aspergillus niger has a molecular weight of 32
kDa, the native enzyme an estimated weight of 250 kDa (Witteveen et al. 1994).

However, the amino acid sequences and the encoding genes are not known for any
L-arabinitol dehydrogenase or L-xylulose reductase. We now disclose such genes.
We also disclose that transforming these genes into a fungus that cannot utilise
L-arabinose but can utilise xylose confers the ability to utilise L-arabinose upon the
transformed fungus.

To identify the genes for L-arabinitol 4-dehydrogenase or L-xylulose reductase
different approaches are possible, and a person knowledgeable in the art might use
different approaches. One approach is to purify the protein with the corresponding
activity and use the information about this protein to clone the corresponding gene.
This can include the proteolytic digestion of the purified protein, amino acid
sequencing of the proteolytic digests and cloning a part of the gene by PCR with
primers derived from the amino acid sequence. The rest of the DNA sequence can
then be obtained in various ways. One way is from a cDNA library by PCR using
primers from the library vector and the known part of the gene. Once the complete
sequence is known the gene can be amplified from the cDNA library, cloned
into an expression vector and expressed in a heterologous host. This is a useful
strategy if screening strategies, or strategies based on homology between
sequences, are not suitable.

Another approach to clone a gene is to screen a DNA library. This is an especially
good and fast procedure when overexpression of a single gene causes a phenotype
which is easy to detect. Now that we have disclosed that transformation of a
xylose-utilising fungus with genes encoding L-arabinitol dehydrogenase and L-xylulose
reductase confers the ability to grow on L-arabinose, another strategy to find the
genes for L-arabinitol 4-dehydrogenase and L-xylulose reductase is the following:
One of the two enzymes is purified and the corresponding gene is cloned. Now all
the genes of the pathway, except one, are known. In this situation a screening
strategy is suitable to find the last gene of the pathway. A strain with all the genes
of the pathway except one can be constructed, transformed with a DNA library, and
screened for growth on L-arabinose. In this strategy one can first purify the
L-arabinitol 4-dehydrogenase and then screen for the L-xylulose reductase, or first
purify the L-xylulose reductase and then screen for the L-arabinitol
4-dehydrogenase.

There are other ways and possibilities to clone these genes:

One can purify both enzymes and find the corresponding genes.

One can screen a DNA library or a combination of two DNA libraries to find both
genes at once.

One can use other screens to find the individual genes.

One could screen for example for growth on L-xylulose to find the L-xylulose
reductase and then for growth on L-arabinose or L-arabinitol to screen for
L-arabinitol 4-dehydrogenase.

Other possible screens could make use of the cofactor requirements, e.g. in a
screening condition which is lethal because of NADPH depletion one could screen
for an L-xylulose reductase in the presence of xylitol.

One can screen existing databanks for genes with homology to genes from related
protein families and test whether they have the desired enzyme activity.

For a person skilled in the art there are different ways to identify the gene which
codes for a protein with the desired enzyme activity. The methods described here
illustrate our invention, but any other method known in the art may be used.

Once all the genes of the L-arabinose pathway are identified, this pathway can be
introduced into a new host organism which is lacking this pathway. It is not always
necessary to introduce all the genes. It might be that the host organism already has
part of the pathway. For example a fungus that can utilise D-xylose might only
require the enzymes that convert L-arabinitol to xylitol. Expression of L-arabinitol
4-dehydrogenase and L-xylulose reductase would then be sufficient to complete the
L-arabinose pathway.
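
The reasoning about which genes a given host still needs can be written as a simple set difference. This is only an illustrative sketch: the enzyme names come from the text, while the function name and the example host definitions are hypothetical.

```python
# Enzymes of the fungal L-arabinose pathway, in pathway order (from the text).
PATHWAY_ENZYMES = [
    "aldose reductase",
    "L-arabinitol 4-dehydrogenase",
    "L-xylulose reductase",
    "D-xylulose reductase",
    "xylulokinase",
]


def genes_to_introduce(host_enzymes):
    """Return the pathway enzymes the host lacks, keeping pathway order."""
    present = set(host_enzymes)
    return [enzyme for enzyme in PATHWAY_ENZYMES if enzyme not in present]


# Hypothetical host that already ferments D-xylose: it has the three
# enzymes shared between the D-xylose and L-arabinose pathways.
xylose_utilising_host = {
    "aldose reductase",
    "D-xylulose reductase",
    "xylulokinase",
}

if __name__ == "__main__":
    print(genes_to_introduce(xylose_utilising_host))
    print(genes_to_introduce(set()))  # a host with no part of the pathway
```

For the D-xylose-utilising host the function returns only L-arabinitol 4-dehydrogenase and L-xylulose reductase, which matches the case described in the paragraph above.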

The L-arabinose pathway can be introduced to S. cerevisiae to generate a strain, 
which is a good ethanol producer and can utilise the pentoses L-arabinose and D- 
xylose. In such a strain the most abundant hexose and pentose sugars can be 
fermented to ethanol. 

In Examples 3 and 5 the genes were cloned into a genetically engineered laboratory
strain of S. cerevisiae. The same approach can be used with an industrial strain of S.
cerevisiae, e.g. a brewer's, distiller's or baker's yeast. Industrial yeasts have process
advantages such as high ethanol tolerance, tolerance of other industrial stresses and
rapid fermentation. They are normally polyploid and their genetic engineering is
more difficult compared to laboratory strains, but methods for their engineering are
known in the art. Other yeasts unable to utilise L-arabinose, or inefficient at doing so,
could be used as hosts, e.g. Schizosaccharomyces pombe or Pichia spp., Candida spp.,
Pachysolen spp., Schwanniomyces spp., Arxula spp., Trichosporon spp., Hansenula
spp. or Yarrowia spp. But our invention is not restricted to yeast nor even to fungi.
It can be practised with any microorganism unable to use L-arabinose or inefficient at using it.

In Examples 3 and 5 we used a TPI promoter from S. cerevisiae for the expression
of L-arabinitol 4-dehydrogenase and the PGK promoter from S. cerevisiae for the
expression of L-xylulose reductase. Both promoters are considered strong and
constitutive. Other promoters, which are stronger or less strong, can be used. It is
also not necessary to use a constitutive promoter. Inducible or repressible promoters
can be used, and may have advantages, for example if a sequential fermentation of
different sugars is desired.

In our example we used two plasmids for the two genes L-arabinitol
4-dehydrogenase and L-xylulose reductase. Each plasmid contained a different
selection marker. These genes can also be expressed from a single plasmid with or
without a selection marker or they can be integrated into the chromosomes. The
selection markers were used to find successful transformations more easily and to
stabilise the genetic construct. The yeast strain was transformed successively with
the different genes and the transformation of S. cerevisiae was performed with the
lithium acetate procedure (Gietz et al. 1992). This is only one method to accomplish
the desired genetic construct. All the necessary genes can be transformed
simultaneously or in succession. Other transformation procedures are known in the
art, some being preferred for a particular host, and they can be used to achieve our
invention.

In Examples 2 and 4 are disclosed the nucleotide sequences (SEQ ID NOs 1 and 2,
respectively) of T. reesei genes encoding L-arabinitol dehydrogenase and
L-xylulose reductase. These are suitable genes for practising our invention as is
disclosed in Examples 5 and 6. It is well known that genes from different organisms
encoding enzymes with the same catalytic activity have sequence similarities and
these similarities can be exploited in many ways by those skilled in the art to clone
other genes from other organisms with the same catalytic activity. Such genes are
also suitable to practise our invention. It is also well known that many small
variations in the nucleotide sequence of a gene do not significantly change the
catalytic properties of the encoded protein. For example, many changes in
nucleotide sequence do not change the amino acid sequence of the encoded protein,
whereas many changes in amino acid sequence do not change the functional
properties of a protein, in particular they do not prevent an enzyme from carrying
out its catalytic function. We call such variations in the nucleotide sequence of
DNA molecules "functionally equivalent variations" because they do not
significantly change the function of the gene to encode a protein with a particular
function, e.g. catalysing a particular reaction. DNA molecules that are functionally
equivalent variations of the molecules defined by SEQ ID NOs 1 and 2 can be used
to practise our invention.



Sometimes organisms contain genes that are not expressed under conditions that are
useful in biotechnological applications. For example, although it was once generally
believed that S. cerevisiae cannot utilise xylose and it was therefore expected that S.
cerevisiae did not contain genes encoding enzymes that would enable it to use
xylose, it has nevertheless been shown that S. cerevisiae does contain such genes
(Richard et al. 1999). However, these genes are not usually expressed adequately.
Thus, another aspect of our invention is to identify genes for L-arabinitol
4-dehydrogenase or L-xylulose reductase or both in a host organism itself and to
cause these genes to be expressed in that same organism under conditions that are
convenient for a biotechnological process, such as ethanolic fermentation of
L-arabinose-containing biomass. We disclose a method of identifying candidates for
such normally unexpressed genes, which is to search for similarity to SEQ ID NOs
1 and 2. A candidate gene can then be cloned in an expression vector and expressed
in a suitable host and cell-free extracts of the host tested for appropriate catalytic
activity as described in Examples 1 and 6. When the normally unexpressed gene has
been confirmed to encode the desired enzyme, the gene can then be cloned back
into the original organism but with a new promoter that causes the gene to be
expressed under appropriate biotechnological conditions. This can also be achieved
by genetically engineering the promoter of the gene in the intact organism.

In yet another aspect of the invention the genes encoding L-arabinitol
dehydrogenase and L-xylulose reductase from a fungus, including fungi such as
filamentous fungi that can have the ability to utilise L-arabinose, can now be easily
identified by similarity to SEQ ID NOs 1 and 2. These genes can then be modified,
for example by changing their promoters to stronger promoters or promoters with
different properties, so as to enhance the organism's ability to utilise L-arabinose.

One embodiment of this aspect is to modify these genes (and possibly also the well
known gene encoding D-xylulose reductase) to create a fungus with an enhanced
capacity to produce the valuable sugar alcohols, L-arabinitol and xylitol, the latter
being a useful sweetener. For example, a fungus containing aldose reductase but
lacking L-arabinitol 4-dehydrogenase will convert L-arabinose to L-arabinitol and
can now be created by the steps of (1) transforming the fungus with the gene for
aldose reductase if it lacks this enzyme and (2) deleting or disrupting the gene for
L-arabinitol 4-dehydrogenase by well known methods that utilise the sequence we
disclose for this gene (SEQ ID NO 1). Similarly, a fungus that contains all the
enzymes of the fungal pathway for converting L-arabinose to xylitol but lacks
D-xylulose reductase will convert L-arabinose into xylitol and can now be created
using the information we disclose in SEQ ID NOs 1 and 2 together with information
about genes for D-xylulose reductase that is already known.

A fungus may not naturally have the enzymes needed for lactic acid production, or
it may produce lactic acid inefficiently. In these cases expression of the gene
encoding lactate dehydrogenase (LDH) enzyme can be increased or improved in the
fungus, and a fungus can then produce lactic acid more efficiently (e.g. WO
99/14335). Similarly, using methods known in the art, a fungus modified to use
arabinose more efficiently as described in this invention can be further modified to
produce lactic acid.

The transformed fungus of the invention may be used to produce ethanol from
L-arabinose. A host fungus is transformed with genes for L-arabinitol
4-dehydrogenase, L-xylulose reductase or both. The host can be any fungus that has
no or only a limited ability to use L-arabinose but is able to ferment D-xylose. For
example it can be a Saccharomyces cerevisiae strain that has been transformed with
genes enabling it to ferment D-xylose. The genes for L-arabinitol 4-dehydrogenase
and L-xylulose reductase can be obtained from T. reesei, as described in Examples
2 and 4, but other genes encoding enzymes with these catalytic activities can also be
used. Such genes are now easily found, for example from microorganisms able to
use L-arabinose, because the sequences disclosed as SEQ ID NOs 1 and 2 can be
exploited in various ways well known in the art to clone similar genes. The methods
used to transform the host fungus and to select transformants can be the same as
those used in Examples 3 and 5, but other methods known in the art can be used
successfully to provide a transformed fungus according to our invention.

The transformed fungus is then used to ferment a carbon source such as biomass
comprising agricultural or forestry products and waste products containing
L-arabinose and other fermentable sugars. The preparation of the carbon source for
fermentation and the fermentation conditions can be the same as those that would be
used to ferment the same carbon source using the host fungus. However, the
transformed fungus according to the invention consumes more L-arabinose than
does the host fungus and produces a higher yield of ethanol on total carbohydrate
than does the host fungus. It is well known that fermentation conditions, including
preparation of carbon source and fermentation temperature, agitation, gas supply,
nitrogen supply, pH control and amount of fermenting organism added, can be
optimised according to the nature of the raw material being fermented and the
fermenting microorganism. Therefore the improved performance of the transformed
fungus compared to the host fungus can be further improved by optimising the
fermentation conditions according to well established process engineering
procedures.

Use of a transformed fungus according to the invention to produce ethanol from
carbon sources containing L-arabinose and other fermentable sugars has several
industrial advantages. These include a higher yield of ethanol per ton of carbon
source and a higher concentration of ethanol in the fermented material, both of
which contribute to lowering the costs of producing, for example, distilled ethanol
for use as fuel. Further, the pollution load in waste materials from the fermentation is
lowered because the L-arabinose content is lowered, so creating a cleaner process.

Lignocellulosic raw materials are very abundant in nature and offer both renewable
and cheap carbohydrate sources for microbial processing. Arabinose-containing
raw materials are e.g. various pectins and hemicellulosics (such as xylans) which
contain mixtures of hexoses and pentoses (xylose, arabinose). Useful raw materials
include by-products from the paper and pulp industry such as spent liquor and wood
hydrolysates, and agricultural by-products such as sugar bagasse, corn cobs, corn
fiber, oat, wheat, barley and rice hulls and straw and hydrolysates thereof. Also
arabinan or galacturonic acid containing polymeric materials can be utilised.



Examples: 
Example 1

Purification and amino acid sequencing of the L-arabinitol 4-dehydrogenase: 

Trichoderma reesei (Rut C-30) was grown in a medium containing 40 g/l
L-arabinose, 2 g/l proteose peptone, 15 g/l KH2PO4, 5 g/l (NH4)2SO4, 0.6 g/l MgSO4
x 7 H2O, 0.8 g/l CaCl2 x 2 H2O and trace elements (Mandels and Weber, 1969) at
28 °C, pH 4.0 and 30% dissolved oxygen in a fermenter (Chemap CF2000). The
fermentation was stopped when the L-arabinose was about 10 g/l. The cells were
harvested with a plastic mesh sieve and washed with 10 mM sodium phosphate pH
7. 500 g of the biomass was frozen in liquid nitrogen in 100 g aliquots. After
thawing and sonicating with a tip sonicator, DTT was added to a final concentration
of 5 mM and the suspension centrifuged (Sorvall SS34, 40 min, 20 000 rpm). The
supernatant was dialysed overnight against a 10-fold volume of buffer A: 10 mM
sodium phosphate pH 7, 5 mM DTT. The retentate was then centrifuged (Sorvall
SS34, 40 min, 20 000 rpm). All steps were performed at 4 °C. The crude extract had
a protein content of 7 g/l and an L-arabinitol dehydrogenase activity of 0.7 nkat per
mg of extracted protein. 500 ml of this crude extract was loaded onto a column with
200 ml DEAE and eluted with a linear gradient from buffer A to buffer A
supplemented with 100 mM NaCl. The highest activity (16 nkat/mg, 5 mg/ml
protein) eluted at about 80 mM NaCl.
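
As a worked check of the figures above, the following sketch computes the protein and activity loaded onto the DEAE column and the enrichment in specific activity. Only values stated in the text are used; the function names are illustrative, and no yield is calculated because the volume of the peak fraction is not given.

```python
def total_protein_mg(volume_ml, protein_g_per_l):
    """Total protein in mg; g/l is numerically equal to mg/ml."""
    return volume_ml * protein_g_per_l


def purification_fold(specific_after, specific_before):
    """Enrichment of specific activity (both in nkat per mg protein)."""
    return specific_after / specific_before


# Crude extract: 500 ml at 7 g/l protein and 0.7 nkat/mg.
crude_protein_mg = total_protein_mg(500, 7)
crude_activity_nkat = crude_protein_mg * 0.7

# Peak DEAE fraction: 16 nkat/mg.
fold = purification_fold(16, 0.7)

if __name__ == "__main__":
    print(f"protein loaded:  {crude_protein_mg:.0f} mg")
    print(f"activity loaded: {crude_activity_nkat:.0f} nkat")
    print(f"enrichment:      {fold:.1f}-fold")
```

With these inputs the DEAE step corresponds to roughly a 23-fold increase in specific activity over the crude extract.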

The L-arabinitol 4-dehydrogenase activity was measured by adding the enzyme
preparation to a buffer containing 100 mM Tris-HCl pH 9.0, 0.5 mM MgCl2, 2 mM
NAD. The reaction was then started by adding L-arabinitol (or other sugars if
specified) to a final concentration of 10 mM. The activity was calculated from the
changes in NADH absorbance at 340 nm. All enzyme assays were done at 37 °C in
a Cobas Mira automated analyser (Roche). In the reverse reaction the activity was
measured by adding the enzyme preparation to a buffer containing 200 mM NaPO4
pH 7.0, 0.5 mM MgCl2, 200 µM NADH and 2 mM L-xylulose. The activity was
calculated from the changes in NADH absorbance at 340 nm.
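
The conversion from an absorbance change to an activity in nkat can be sketched as follows, using the Beer-Lambert law and the commonly used molar absorption coefficient of NADH at 340 nm (6220 M^-1 cm^-1). The assay volume, path length and absorbance slope in the example are hypothetical placeholders; the text does not state the values used in the Cobas Mira analyser.

```python
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1, molar absorption coefficient of NADH


def activity_nkat(delta_a340_per_min, assay_volume_ml, path_length_cm=1.0):
    """Enzyme activity in nkat (nmol substrate converted per second).

    delta_a340_per_min: absolute slope of A340 over time (per minute).
    assay_volume_ml and path_length_cm are assumptions about the cuvette.
    """
    # Beer-Lambert: concentration change rate in mol l^-1 s^-1
    rate_molar_per_s = abs(delta_a340_per_min) / 60.0 / (
        NADH_EPSILON_340 * path_length_cm)
    # mol s^-1 in the assay volume, then convert to nmol s^-1
    return rate_molar_per_s * (assay_volume_ml / 1000.0) * 1e9


def specific_activity(nkat, protein_mg):
    """Specific activity in nkat per mg protein in the assay."""
    return nkat / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 0.622 absorbance units per minute in a 1 ml assay.
    a = activity_nkat(0.622, assay_volume_ml=1.0)
    print(f"{a:.3f} nkat, {specific_activity(a, 0.25):.2f} nkat/mg")
```

The same function applies to both assay directions, since NADH formation (forward) and NADH consumption (reverse) differ only in the sign of the slope.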

The partially purified enzyme was tested for activity with other sugars. No activity
was found with D-arabinitol. Activity was found with L-arabinitol and adonitol
(ribitol). The activity with ribitol was about 80% of the activity found with
L-arabinitol. No activity with either sugar was found when NADP was used as a
cosubstrate.

In the reverse reaction with L-xylulose and NADH an activity of 0.8 nkat/mg was
found with 2 mM L-xylulose at pH 7.0 compared to 6.4 nkat/mg with 10 mM
L-arabinitol and 5 nkat/mg with 10 mM adonitol (ribitol).
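
The reported specific activities can be put on a common scale relative to L-arabinitol, as in the short sketch below. The dictionary keys are descriptive labels invented here; note that the reverse reaction was measured at a different pH and substrate concentration, so the ratio is a rough comparison rather than a kinetic constant.

```python
# Specific activities from the text, in nkat per mg protein.
ACTIVITY_NKAT_PER_MG = {
    "L-arabinitol, 10 mM (forward)": 6.4,
    "adonitol (ribitol), 10 mM (forward)": 5.0,
    "L-xylulose, 2 mM (reverse, pH 7.0)": 0.8,
}


def relative_activity(reference):
    """Express every activity as a fraction of the reference substrate's."""
    ref = ACTIVITY_NKAT_PER_MG[reference]
    return {name: value / ref for name, value in ACTIVITY_NKAT_PER_MG.items()}


if __name__ == "__main__":
    for name, frac in relative_activity("L-arabinitol, 10 mM (forward)").items():
        print(f"{name}: {frac:.0%}")
```

The ribitol value works out to about 78% of the L-arabinitol value, consistent with the "about 80%" stated for the partially purified enzyme.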

600 µl of the fraction with the highest activity after the DEAE column was then run
on a native PAGE (12% acrylamide, BioRad). The gel was then stained in a
zymogram staining solution containing: 200 mM Tris-HCl pH 9.0, 100 mM
L-arabinitol, 0.25 mM nitroblue tetrazolium, 0.06 mM phenazine methosulfate, 1.5 mM
NAD.

The only band which appeared in the staining was cut out and eluted by overnight
incubation in 2 ml 100 mM Tris-HCl pH 9.0, 0.1% SDS. It was then concentrated to
about 80 µl with Centricon (Amicon).

This gave an almost pure enzyme preparation with the major band in SDS PAGE at
about 38 kDa. This protein was then used for amino acid sequencing of the
proteolytic digests. The results of this sequencing were the following:

Internal peptide sequences of the purified L-arabinitol 4-dehydrogenase:

1: ATGAAISVKPNIGVFTNPK
2: YSNTWPR
3: AFETSADPK
4: HDLWISEAEP

Example 2

Cloning of the L-arabinitol 4-dehydrogenase: 

Cloning a gene fragment by using the internal amino acid sequences: 

The internal peptide sequences were used to design degenerate primers for PCR.
The template in the first approach was genomic DNA from Trichoderma reesei. A
sense DNA sequence corresponding to the amino acid fragment ATGAAISVKPNIGVFTNPK
(primer 5384: ARCCIAAYATHGGIGTITTYACIAAYCC)
and an anti-sense DNA sequence corresponding to the amino acid fragment AFETSADPK
(primer 5285: GGRTCIGCIGAIGTYTCRAAIGC) were used. The PCR
conditions were: denaturation 30 s at 96 °C; annealing 30 s, first 2 times at 37 °C and
then 27 times at 42 °C; extension 2 min at 72 °C; final extension 5 min at 72 °C. This
procedure gave a PCR product of about 1 kb. The resulting fragment
was then cloned into a TOPO vector (Invitrogen).



15 This construct was then used for sequencing. 
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The sequence of the PCR product coded also for the remaining two peptide 
sequences (see figure 2). 
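The correspondence between the two degenerate primers and the peptide fragments can be checked with a short script. The following is only a sketch under stated assumptions: "I" in the primers is taken to be inosine and treated as pairing with any base, the standard genetic code is used, and the reading frame of the sense primer (offset 2, since it starts with the last two bases of a lysine codon) is inferred from the peptide rather than stated in the text. The function names are illustrative.

```python
from itertools import product

# IUPAC codes used in the primers; "I" (inosine) is ASSUMED to match any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "H": "ACT", "D": "AGT", "N": "ACGT", "I": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
              "H": "D", "D": "H", "N": "N", "I": "I"}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def encoded_peptide(primer, frame=0):
    """Translate every complete codon; 'X' marks an ambiguous codon."""
    peptide = []
    for pos in range(frame, len(primer) - 2, 3):
        codon = primer[pos:pos + 3]
        # Expand the degenerate codon into all concrete codons it covers.
        residues = {CODON_TABLE["".join(bases)]
                    for bases in product(*(IUPAC[b] for b in codon))}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


sense = "ARCCIAAYATHGGIGTITTYACIAAYCC"   # primer 5384
antisense = "GGRTCIGCIGAIGTYTCRAAIGC"    # primer 5285

sense_peptide = encoded_peptide(sense, frame=2)
antisense_peptide = encoded_peptide(reverse_complement(antisense))
print(sense_peptide, antisense_peptide)
```

Under these assumptions each complete codon of both primers resolves to a single amino acid, and the resulting stretches lie within peptides 1 and 3, respectively.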

Cloning the N- and C-termini from a cDNA library:

A cDNA library in a yeast expression vector (Margolles-Clark et al. 1996) was used
to clone the remaining parts of the gene. In this expression vector the cDNA is
located between a PGK promoter and terminator. To clone the part of the gene which
corresponds to the N-terminus of the protein, a PCR reaction was carried out with
the cDNA library as a template, one primer from the PGK promoter region and an
antisense primer from the gene fragment of the L-arabinitol 4-dehydrogenase.

Primer of the PGK promoter region: (primer 4196: TCAAGTTCTTAGATGCTT)

Antisense primer of the gene fragment: (primer 5431: CCTTTCCTCCAAACTTGCTGG)

The part of the gene which corresponds to the C-terminus of the protein was
cloned in a similar way with a primer from the gene fragment and an antisense
primer from the PGK terminator.

Antisense primer of the PGK terminator region: (primer 3900: TAGCGTAAAGGATGGGG)

Primer of the gene fragment: (primer 5430: CTGCATTGGGCCCATGAT)

The PCR conditions were as described above except that the annealing was 30 cycles
at 50 °C.

The N-terminus gave a PCR product of about 0.8 kb; the C-terminus gave a PCR
product of about 0.9 kb. The PCR products were cloned into TOPO vectors and the
resulting vectors used for sequencing.

With the information on the C-terminus and the N-terminus, the open reading frame
was then cloned by PCR from the cDNA library. The primer for the N-terminus
contained an additional EcoRI restriction site (primer 5526:
AGAATTCACCATGTCGCCTTCCGCAGTC). The primer for the C-terminus
contained an additional BamHI restriction site (primer 5468:
ACGGATCCTCTACCTGGTAGCACCTCA). The annealing in the PCR reaction
was 30 cycles at 60.5 °C; otherwise the conditions were as described above. This gave
a fragment of 1.1 kb, which was then cloned into a TOPO vector and used for
sequencing.
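As an illustration, the restriction sites added to the two cloning primers can be located with a few lines of Python. This is a minimal sketch: the recognition sequences for EcoRI (GAATTC), BamHI (GGATCC) and HindIII (AAGCTT) are the commonly published ones, and the helper name `find_sites` is hypothetical. Because all three sites are palindromic, searching one strand is sufficient.

```python
# Recognition sequences of the enzymes mentioned in the examples.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(primer):
    """Return {enzyme: 0-based position} for each site found in the primer."""
    return {name: primer.find(site)
            for name, site in SITES.items() if site in primer}


primer_5526 = "AGAATTCACCATGTCGCCTTCCGCAGTC"   # N-terminal primer
primer_5468 = "ACGGATCCTCTACCTGGTAGCACCTCA"    # C-terminal primer

sites_5526 = find_sites(primer_5526)
sites_5468 = find_sites(primer_5468)
start_codon_pos = primer_5526.find("ATG")       # start codon after the site

print(sites_5526, sites_5468, start_codon_pos)
```

In this sketch each primer carries exactly one of the three sites near its 5' end, and the ATG of primer 5526 follows the EcoRI site, which is consistent with the later ligation of the EcoRI, BamHI fragment into an expression vector.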

Comparing the sequences derived from genomic DNA and cDNA reveals an intron
of 69 base pairs (see Figure 2).
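The intron length can be re-derived from the two rows of Figure 2 that span it. The sketch below assumes that the single space inside rows 241 and 301 of the figure marks the exon/intron boundaries; the variable names are illustrative only.

```python
# Rows 241 and 301 of Figure 2; the space is ASSUMED to mark the
# exon/intron boundary as printed in the figure.
row_241 = "CTGTGG GTATGTATAACGCTTCTGTCCACAGAGCGCAAGCGCAGAGGAGCAGCATGCTGA"
row_301 = "ACGAAATACGAATAG TTGAGATGTCCATTTCTGGCACGCCGGCTGCATTGGGCCCATGAT"

exon1_end, intron_5prime = row_241.split(" ")
intron_3prime, exon2_start = row_301.split(" ")

intron = intron_5prime + intron_3prime
intron_length = len(intron)

# 1-based genomic coordinates, using the row numbers given in the figure.
intron_first = 241 + len(exon1_end)
intron_last = 301 + len(intron_3prime) - 1

print(intron_length, intron_first, intron_last, intron[:2], intron[-2:])
```

Under this reading the intron is 69 bases long, runs from position 247 to 315 of the genomic sequence, and begins with GT and ends with AG, as is usual for spliceosomal introns.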



The open reading frame codes for a protein with 377 amino acids and a calculated
molecular weight of 39806 g/mol.
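A quick plausibility check of these figures, given only as a sketch: the coding length follows from three bases per residue plus a stop codon, and the mean mass per residue can be compared with the value of roughly 110 Da often quoted for average proteins. The function names are hypothetical.

```python
def coding_length_bp(n_residues, include_stop=True):
    """Number of bases needed to encode n_residues (plus a stop codon)."""
    return 3 * (n_residues + (1 if include_stop else 0))


def mean_residue_mass(molecular_weight, n_residues):
    """Average mass per amino acid residue in g/mol."""
    return molecular_weight / n_residues


lad_bp = coding_length_bp(377)
lad_mean_mass = mean_residue_mass(39806, 377)
print(lad_bp, round(lad_mean_mass, 1))
```

The 1134 coding bases agree with the PCR fragment of about 1.1 kb, and the calculated weight is close to the band at about 38 kDa seen in SDS PAGE.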

Example 3

Expression of L-arabinitol 4-dehydrogenase in S. cerevisiae:

From the TOPO vector the 1.1 kb EcoRI, BamHI fragment was ligated into the
corresponding sites of the pYX242 vector (R&D Systems). The pYX242 is a
multicopy yeast expression vector with a yeast TPI promoter and LEU2 for
selection. This plasmid was then transformed to the S. cerevisiae strain CEN.PK2
(VW1b). The recombinant yeast cells were grown on selective medium. The
intracellular proteins were then extracted from the yeast cells by vortexing with
glass beads. The extract was then analysed for L-arabinitol dehydrogenase activity.
We found an L-arabinitol 4-dehydrogenase activity of 0.2 to 0.3 nkat per mg of
extracted protein.



Example 4 

Screening for the L-xylulose reductase:

To screen for an L-xylulose reductase, an S. cerevisiae strain was used which
contained the genes for xylose reductase (aldose reductase, EC 1.1.1.21),
L-arabinitol 4-dehydrogenase (EC 1.1.1.12), D-xylulose reductase (EC 1.1.1.9) and
xylulokinase (EC 2.7.1.17). The aldose reductase, D-xylulose reductase and
xylulokinase were integrated. This strain was constructed so that uracil and leucine
could still be used for selection. The plasmid from Example 3, with the L-arabinitol
4-dehydrogenase on a multicopy plasmid, was transformed to the strain with the
integrated aldose reductase, D-xylulose reductase and xylulokinase. In this strain
the uracil auxotrophy was still left for selection. A cDNA library from T. reesei in a
yeast expression vector with uracil marker (Margolles-Clark et al. 1996) was then
transformed to this strain and screened for growth on L-arabinose. For screening,
the transformants were first grown on glucose plates with selection. About 750 000
transformants were then replica plated to selective plates with 5% L-arabinose as
the sole carbon source. Colonies which appeared after 2 to 3 weeks were streaked
again on L-arabinose. The resulting colonies were then grown on glucose and the
plasmids rescued. The plasmids were transformed to E. coli cells. Since both
plasmids, the plasmid with the L-arabinitol 4-dehydrogenase and the plasmid from
the cDNA library, contained only ampicillin resistance, we used colony PCR to
identify the E. coli with the cDNA library plasmid. For the colony PCR we used
primers of the PGK promoter and terminator region. From 4 independent clones
which appeared in the L-arabinose screening, a PCR product of 0.9 kb was obtained.
The corresponding plasmids were then sequenced. The sequence of the cDNA is
shown in Figure 3. The open reading frame codes for a protein with 266 amino acids
with a calculated molecular weight of 28,428 Da.

Example 5 

Expression of the L-xylulose reductase:

The expression vector with the L-xylulose reductase obtained in Example 4 was
used. It was retransformed to the strain containing the genes for xylose reductase
(aldose reductase, EC 1.1.1.21), L-arabinitol 4-dehydrogenase (EC 1.1.1.12),
D-xylulose reductase (EC 1.1.1.9) and xylulokinase (EC 2.7.1.17), which was also
used in Example 4. As a control, the empty cloning vector pAJ401 was
transformed instead of the vector with the L-xylulose reductase. Transformants
were first grown on D-glucose plates and then streaked on plates with 5%
L-arabinose as the sole carbon source. The plates contained a carbon source and
selective medium leaving out uracil and leucine as required for selection (Sherman
et al. 1983). On the L-arabinose plates, colonies appeared after 2 to 4 weeks with the
strains with L-xylulose reductase; no colonies appeared in the control.



Example 6 

Expression of the L-xylulose reductase under TPI promoter: 

The L-xylulose reductase was cloned by PCR, using the vector from Example 5 as a
template. The primers were (LXR-start EcoRI:
GCCGAATTCATCATGCCTCAGCCTGTCCCCACCGCC) and (LXR-stop
HindIII: CGCCAAGCTTTTATCGTGTAGTGTAACCTCCGTCAATCAC). The
conditions were as in Example 2 except that the annealing temperature was 63 °C.
The PCR product was digested with EcoRI and HindIII. The vector pYX212 (R&D
Systems), which is a yeast expression vector with TPI promoter and contains the
URA3 gene for selection, was digested with EcoRI and HindIII. The PCR product
was then ligated to the expression vector. The resulting vector was then transformed
to the yeast strain CEN.PK2. The recombinant yeast cells were grown on selective
medium. The intracellular proteins were then extracted from the yeast cells by
vortexing with glass beads. The extract was then analysed for L-xylulose reductase
activity. The activity was measured in a medium containing 100 mM TrisCl pH 9.0,
1.6 M xylitol and 2 mM MgCl2. 2 mM NADP (final concentration) was added as a
start reagent. The activity was calculated from the change in NADPH absorbance at
340 nm. The assay was performed at 37 °C in a Cobas Mira automated analyser
(Roche). The activity was between 2 and 5 nkat per mg of extracted protein.
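The conversion from an absorbance change to a specific activity in nkat per mg can be sketched as follows. This is an illustration, not the procedure of the analyser: it assumes the commonly used molar absorption coefficient of NADPH at 340 nm (6.22 per mM per cm), a 1 cm light path, and hypothetical values for assay volume and protein amount, none of which are given in the text.

```python
NADPH_EPSILON_MM = 6.22  # mM^-1 cm^-1 at 340 nm (commonly used value)


def specific_activity_nkat_per_mg(delta_a_per_min, assay_volume_ml, protein_mg,
                                  epsilon_mm=NADPH_EPSILON_MM, path_cm=1.0):
    """Specific activity from the rate of absorbance change at 340 nm.

    1 kat = 1 mol/s, so 1 nkat = 1 nmol/s; 1 U (µmol/min) = 1000/60 nkat.
    """
    # Beer-Lambert law: concentration change in mM/min (= µmol per ml per min).
    rate_mm_per_min = delta_a_per_min / (epsilon_mm * path_cm)
    umol_per_min = rate_mm_per_min * assay_volume_ml
    nkat = umol_per_min * 1000.0 / 60.0
    return nkat / protein_mg


# Hypothetical reading: 0.0622 absorbance units per minute in a 1 ml assay
# containing 0.05 mg of extracted protein.
example_activity = specific_activity_nkat_per_mg(0.0622, 1.0, 0.05)
print(round(example_activity, 2))
```

With these illustrative inputs the result falls within the 2 to 5 nkat per mg range reported above; a different light path or assay volume would scale the result proportionally.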

Claims

1. An isolated DNA molecule, characterised in that it comprises a gene coding for 
a L-arabinitol 4-dehydrogenase. 

2. An isolated DNA molecule according to claim 1, characterised in that it has the
sequence of SEQ ID NO. 1 or a functionally equivalent variant thereof.

3. An isolated DNA molecule, characterised in that it comprises a gene coding for
a L-xylulose reductase.

4. An isolated DNA molecule according to claim 3, characterised in that it has the
sequence of SEQ ID NO. 2 or a functionally equivalent variant thereof.

5. A genetically modified fungus, characterised in that it has been transformed by a
gene for L-arabinitol 4-dehydrogenase of claim 1 or 2 or a gene for L-xylulose
reductase of claim 3 or 4 or both such genes.

6. A genetically modified fungus, characterised in that the expression or activity 
encoded by a gene for L-arabinitol 4-dehydrogenase of claim 1 or 2 or a gene for L- 
xylulose reductase of claim 3 or 4 or both such genes has been modified. 

7. A genetically modified fungus, characterised in that it expresses one or more 
genes of the fungal L-arabinose pathway at a higher level than does the 
corresponding unmodified fungus under the same conditions and is able to utilise L- 
arabinose faster than the corresponding untransformed fungus. 

8. A genetically modified fungus according to claim 7, characterised in that it has 
been transformed by a gene for L-arabinitol 4-dehydrogenase of claim 1 or 2 or a 
gene for L-xylulose reductase of claim 3 or 4 or both such genes. 

9. A genetically modified fungus according to claim 7, characterised in that the 
expression or activity encoded by a gene for L-arabinitol 4-dehydrogenase of claim 
1 or 2 or a gene for L-xylulose reductase of claim 3 or 4 or both such genes has 
been modified. 

10. A genetically modified fungus according to any of claims 5 to 9, characterised 
in that it has an improved ability to utilise L-arabinose for growth. 

11. A genetically modified fungus according to any of claims 5 to 10,
characterised in that it produces useful products.

12. A genetically modified fungus according to any of claims 5 to 11,
characterised in that it produces intermediates in the pentose phosphate pathway or
in the fungal L-arabinose pathway, or ethanol, lactic acid, xylitol or the like.

13. A genetically modified fungus according to any of claims 5 to 12,
characterised in that the fungus is a yeast.

14. A genetically modified fungus according to claim 13, characterised in that the
yeast is a strain of Saccharomyces species, Schizosaccharomyces species,
Kluyveromyces species, Pichia species, Candida species or Pachysolen species.

15. A genetically modified fungus according to claim 14, characterised in that the
strain is a genetically engineered strain of S. cerevisiae.

16. A genetically modified fungus according to any of claims 5 to 12,
characterised in that the fungus is a filamentous fungus.

17. A genetically modified fungus according to claim 16, characterised in that the
strain is a genetically engineered strain of Aspergillus species, Trichoderma species,
Neurospora species, Fusarium species, Penicillium species, Humicola species,
Tolypocladium geodes, Trichoderma reesei (Hypocrea jecorina), Mucor species,
Trichoderma longibrachiatum, Aspergillus nidulans, Aspergillus niger or
Aspergillus awamori.

18. A genetically modified fungus according to any of claims 5 to 17,
characterised in that the fungus prior to modification is able to utilise D-xylose.

19. A method of producing useful products from biomass containing L-arabinose,
characterised in that the product is produced from said biomass by a genetically
modified fungus of one of claims 5 to 18.

20. A method according to claim 19, characterised in that the useful product is
ethanol, lactic acid or xylitol.



Abstract 

A fungal microorganism can be engineered by means of 
genetic engineering to utilise L-arabinose. The genes of the 
L-arabinose pathway, which were unknown, i.e. L- 
arabinitol 4-dehydrogenase and L-xylulose reductase, were 
identified. These genes, together with the known genes of 
the L-arabinose pathway, form a functional pathway. This 
pathway can be introduced to a fungus, which is 
completely or partially lacking this pathway. 



Fungal pathway:

L-arabinose
    | aldose reductase (EC 1.1.1.21)
L-arabinitol
    | L-arabinitol 4-dehydrogenase (EC 1.1.1.12)
L-xylulose
    | L-xylulose reductase (EC 1.1.1.10)
xylitol
    | D-xylulose reductase (EC 1.1.1.9)
D-xylulose
    | xylulokinase (EC 2.7.1.17)
D-xylulose 5-phosphate

Bacterial pathway:

L-arabinose
    | L-arabinose isomerase (EC 5.3.1.4)
L-ribulose
    | ribulokinase (EC 2.7.1.16)
L-ribulose 5-phosphate
    | L-ribulosephosphate 4-epimerase (EC 5.1.3.4)
D-xylulose 5-phosphate

Figure 1

1 CTCAAACGCCTTGTTCGCCGGAGACCGCGCGCATTCACAGCTCGCCATGTCGCCTTCCGC
1 M S P S A

61 AGTCGATGACGCTCCCAAGGCCACAGGGGCAGCCATCTCAGTCAAGCCCAACATTGGCGT
6 V D D A P K A T G A A I S V K P N I G V

121 CTTCACAAATCCAAAACATGACCTCTGGATTAGCGAAGCTGAACCCAGCGCCGATGCCGT
26 F T N P K H D L W I S E A E P S A D A V

181 CAAATCTGGCGCTGATCTGAAGCCCGGCGAGGTGACCATTGCTGTCCGCAGCACTGGTAT
46 K S G A D L K P G E V T I A V R S T G I

241 CTGTGG GTATGTATAACGCTTCTGTCCACAGAGCGCAAGCGCAGAGGAGCAGCATGCTGA
66 C G

301 ACGAAATACGAATAG TTGAGATGTCCATTTCTGGCACGCCGGCTGCATTGGGCCCATGAT
68 S D V H F W H A G C I G P M I

361 CGTCGAGGGCGACCACATCCTCGGCCACGAGTCTGCCGGCGAGGTCATCGCCGTCCACCC
83 V E G D H I L G H E S A G E V I A V H P

421 GACTGTCAGTAGCCTCCAAATCGGCGATCGGGTTGCCATCGAGCCCAACATCATCTGCAA
103 T V S S L Q I G D R V A I E P N I I C N

481 CGCGTGCGAGCCCTGCCTGACAGGTCGATACAACGGCTGCGAAAAGGTCGAGTTCCTATC
123 A C E P C L T G R Y N G C E K V E F L S

541 CACGCCGCCAGTGCCCGGACCGCTGCGACGCTACGTCAACCACCCAGCCGTTTGGTGCCA
143 T P P V P G P L R R Y V N H P A V W C H

601 CAAGATTGGCAACATGTCGTGGGAGAACGGCGCGCTGCTGGAGCCCCTGAGCGTGGCTCT
163 K I G N M S W E N G A L L E P L S V A L

661 GGCCGGCATGCAGAGGGCCAAGGTTCAGCTCGGTGACCCCGTGCTGGTCTGCGGCGCTGG
183 A G M Q R A K V Q L G D P V L V C G A G

721 TCCGATTGGATTGGTGTCAATGCTGTGCGCTGCTGCCGCCGGTGCTTGCCCGCTTGTCAT

781 CACAGACATTTCAGAGAGCCGTCTGGCGTTTGCAAAGGAGATCTGCCCCCGGGTCACCAC

841 GCACCGCATCGAGATTGGCAAGTCGGCTGAGGAAACGGCCAAAAGCATCGTCAGCTCTTT
243 H R I E I G K S A E E T A K S I V S S F

901 TGGGGGCGTCGAGCCAGCCGTGACCCTGGAGTGCACCGGTGTGGAGAGCAGCATTGCAGC
263 G G V E P A V T L E C T G V E S S I A A

961 GGCCATCTGGGCCAGCAAGTTTGGAGGAAAGGTCTTTGTGATCGGCGTCGGCAAGAATGA
283 A I W A S K F G G K V F V I G V G K N E



Figure 2, cont 



1021 AATCAGCATTCCCTTTATGAGGGCCAGTGTACGCGAGGTCGATATCCAGCTGCAGTATCG
303 I S I P F M R A S V R E V D I Q L Q Y R

1081 CTACAGCAACACCTGGCCTCGTGCCATCCGGCTCATCGAGAGCGGTGTCATCGATCTATC
323 Y S N T W P R A I R L I E S G V I D L S

1141 CAAATTTGTGACGCATCGCTTCCCGCTGGAGGATGCCGTCAAGGCATTTGAGACGTCAGC
343 K F V T H R F P L E D A V K A F E T S A

1201 AGATCCCAAGAGCGGCGGCATTAAGGTCATGATTCAGAGCCTGGATTGAGAGTGAGGTGC
363 D P K S G A I K V M I Q S L D *

1261 TACCAGGTAGAGGTAGATAATAGATAGATGATGAAGATGGAAAGACTGCGGGCGCAAGAA